Overview ΒΆ
Project Objectives ΒΆ
Route optimization helps in improving the efficiency of operations. With streamlined routes, drivers can complete more deliveries or service calls in less time. This increased productivity can lead to higher customer satisfaction and profitability. Route optimization also allows for better use of resources: it can help in effectively managing the fleet and reducing the need for additional vehicles or drivers. To meet this demand, many companies offer optimization solutions through apps and APIs that seamlessly integrate with business operations.
As a former data analyst intern in an e-commerce platform's logistics division, I developed a keen interest in implementing ML model in this field. I aim to delve into these mechanisms and try to solve some common problems with small set of data.
Techniques ΒΆ
- Dijkstra's algorithm: This algorithm is used to determine the shortest paths between nodes within a weighted graph. It's particularly effective for optimizing routing and navigation problems by identifying the most efficient routes between points.
- OR-Tools: OR-Tools is a comprehensive optimization package developed by Google. It encompasses a range of functions for setting up optimization environments, defining business constraints, and implementing various optimization strategies. It's a versatile tool used in fields like logistics, scheduling, and resource allocation to find optimal solutions to complex problems.
- Map visualization package: Folium is a powerful tool for visualizing geospatial data on interactive maps. It provides a user-friendly interface to create maps with custom markers, overlays, and layers, making it ideal for displaying geographical information and analyzing spatial relationships.
- Map information package: OSMnx is a package designed for retrieving, constructing, and analyzing street networks and other spatial data from OpenStreetMap. It's particularly useful for extracting detailed map information, such as street networks, building footprints, and elevation data, which can be leveraged for various geographical analyses and visualizations.
Common Routing Optimization Problems and Solutions ΒΆ
These represent several typical routing optimization challenges. Variations stem from these fundamentals but entail different and more intricate objectives.
Shortest Distance or Traveling Time ΒΆ
The most straightforward problem is to find the shortest route from point A to point B, with the objective function based on either geographical distance or travel time. This scenario does not include additional obstacles or requirements, such as passing through point C or starting or ending at a specific node.
Dijkstra's algorithm is commonly used to solve this problem. It works by selecting the unvisited vertex with the lowest distance, then calculating the distance through it to each unvisited neighbor and updating the neighbor's distance if it's smaller. This process continues until the shortest path from A to B is determined.
The algorithm and scenario can be implemented under the assumption that each location is connected by a direct road, disregarding red lights, stop signs, toll roads, and other obstructions. This simplification allows for a straightforward calculation of the shortest route based solely on geographical distance or travel time.
Here's an example of how the algorithm works in practice. Let's say we want to determine the shortest route from node 1 to node 9. The main loop of the algorithm operates as follows:
- Begin by visiting each node starting from the vertex with the smallest known distance from the start node. Initially, this will be the origin (node 1).
- For the current vertex being visited, examine its neighboring nodes that haven't been visited yet.
- Calculate the distance from each unvisited neighboring node to the start node. For instance, starting from node 1, we would calculate the distances from 1 to 2 as 2, from 1 to 3 as 5, and from 1 to 4 as 2.
- If the calculated distance is shorter than the known distance, update the shortest distance.
- Update the previous vertex for each updated shortest distance.
- Add the current vertex to the list of visited vertices. This loop continues until all vertices have been visited.
Vehicle Routing Problem ΒΆ
Vehicle routing problem is a generic name given to a whole class of problems concerning the optimal design of routes to be used by a fleet of vehicles to serve a set of customers. This becomes more complex as we include more factors like vehicle capacity and driver availability. For example, in a delivery company, it asks for a determination of a set of routes, S, (one route for each vehicle that must start and finish at its own depot) such that all customers' requirements and operational constraints are satisfied and the objective function is minimized or maximized.
Different variants of Vehicle Routing Problems include:
- Capacitated Vehicle Routing Problem: Vehicles have limited carrying capacity for the goods they must deliver.
- Vehicle Routing Problem with Time Windows: Delivery locations have specific time windows within which deliveries must be completed.
- Vehicle Routing Problem with Pickup and Delivery: Goods need to be transported from pickup locations to delivery locations.
- Vehicle Routing Problem with Profits: Vehicles are not obligated to visit all nodes; the objective is to maximize the total collected profits.
With different business constraints and background The objective function can also be very different depending on the particular application of the result but a few of the more common objectives are:
- Minimize the global transportation cost based on the global distance travelled as well as the fixed costs associated with the used vehicles and drivers
- Minimize the number of vehicles needed to serve all customers
- Least variation in travel time and vehicle load
- Minimize penalties for low quality service
- Maximize a collected profit/score.
Here are some optimization algorithm commonly used in solving vehicle routing problem:
Genetic Algorithms (GA): they search problems by mimicking the principles of evolution and natural selection observed in biological systems. They start with random strategies and solutions, assess their fitness, then allow the best ones to combine strategies, akin to how animals pass traits to offspring. Introducing randomness via mutation, like new traits arising in living things, adds diversity. This cycle repeats over generations, making solutions smarter and more optimized, similar to how species adapt over time to their environments.
Neighborhood search algorithms: these algorithms work by iteratively exploring the neighborhood of a current solution to find improved solutions. It explore small changes (neighborhoods) from your current solution and moving towards better solutions within that neighborhood. They're great for problems where you want to find the best solution by making incremental improvements step by step.
Simulated Annealing (SA): it is similar to neighborhood search algorithms, but it adds a "temperature" parameter. This temperature controls how much randomness is allowed in the search. Starting with a high temperature enables extensive exploration and random movements across the solution space. As the algorithm progresses through iterations, the temperature gradually decreases following a predefined cooling schedule. This cooling process makes the algorithm more deterministic, meaning it becomes less likely to accept worse solutions as it progresses. The algorithm stops either when it reaches the maximum number of iterations or when a termination condition is met.
Ant Colony Optimization (ACO): it mimics ants' pheromone-based communication to find optimal paths. Artificial ants randomly explore solutions, choosing paths based on pheromone levels and heuristics. Higher pheromone levels and shorter paths are favored. ACO updates pheromone information iteratively, using historical data to guide exploration, unlike neighborhood search algorithms that focus narrowly on immediate solutions.
Traveling Salesman Problem ΒΆ
The Traveling Salesman Problem (TSP) is a specific type of routing vehicle problem that focuses on finding the shortest possible route for a salesman who needs to visit a list of cities exactly once and return to the starting city. The goal is to find the most efficient route that allows the salesman to visit every city exactly once and return to the city where they started their journey. This problem is crucial in logistics, transportation planning, and optimizing travel routes.
It's important to note that as the number of cities increases in the Traveling Salesman Problem, the sheer number of possible routes grows exponentially, posing a significant challenge for finding the optimal solution. To address this challenge, a range of optimization algorithms are employed, including heuristic approaches like nearest neighbor algorithms, genetic algorithms, simulated annealing, and ant colony optimization. These algorithms employ diverse strategies to progressively enhance the route by analyzing subsets of cities and assessing potential routes based on factors such as total distance or time.
Traveling Purchaser Problem ΒΆ
The Traveling Purchaser Problem resembles the Traveling Salesman Problem but incorporates inventory availability along the route, in addition to distance and transportation costs. The optimization algorithms commonly used for vehicle routing problems are also applicable to solving this type of problem.
Dataset ΒΆ
A comprehensive collection of Starbucks locations worldwide, containing data on Street Address, Country, City, Longitude, and Latitude. For the sake of project simplicity, I will focus exclusively on stores located in San Francisco, CA. Below is a summary of the targeted store data.
Examples of Solving Each Problem ΒΆ
Import packages ΒΆ
## for data processing
import pandas as pd
import numpy as np
## for plotting
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px
## interacitve map
import folium
from folium import plugins
## for simple routing
import osmnx as ox # street map data
import networkx as nx # graph object
## for advanced routing
from ortools.constraint_solver import pywrapcp
from ortools.constraint_solver import routing_enums_pb2
import warnings
# Ignore all warnings
warnings.filterwarnings("ignore")
df = pd.read_csv('dataset.csv')
df[(df["Country"]=='US') & (df["State/Province"]=='CA') & (df["City"]=='San Francisco')].info()
df = df[(df["Country"]=='US') & (df["State/Province"]=='CA')].groupby('City')['Store Number'].count().sort_values(ascending=False)
<class 'pandas.core.frame.DataFrame'> Index: 89 entries, 14694 to 14783 Data columns (total 13 columns): # Column Non-Null Count Dtype --- ------ -------------- ----- 0 Brand 89 non-null object 1 Store Number 89 non-null object 2 Store Name 89 non-null object 3 Ownership Type 89 non-null object 4 Street Address 89 non-null object 5 City 89 non-null object 6 State/Province 89 non-null object 7 Country 89 non-null object 8 Postcode 89 non-null object 9 Phone Number 87 non-null object 10 Timezone 89 non-null object 11 Longitude 89 non-null float64 12 Latitude 89 non-null float64 dtypes: float64(2), object(11) memory usage: 9.7+ KB
Preprocessing ΒΆ
# read data and filter out LA stores
df = pd.read_csv('dataset.csv')
df = df[(df["Country"]=='US') & (df["State/Province"]=='CA') & (df["City"]=='San Francisco')][
["City","Street Address","Latitude","Longitude"]
].reset_index(drop=True)
df = df.reset_index().rename(
columns={"index":"id", "Latitude":"y", "Longitude":"x"})
print(df.info())
df.head(3)
<class 'pandas.core.frame.DataFrame'> RangeIndex: 89 entries, 0 to 88 Data columns (total 5 columns): # Column Non-Null Count Dtype --- ------ -------------- ----- 0 id 89 non-null int64 1 City 89 non-null object 2 Street Address 89 non-null object 3 y 89 non-null float64 4 x 89 non-null float64 dtypes: float64(2), int64(1), object(2) memory usage: 3.6+ KB None
| id | City | Street Address | y | x | |
|---|---|---|---|---|---|
| 0 | 0 | San Francisco | 865 Market St., #324 | 37.78 | -122.41 |
| 1 | 1 | San Francisco | 2018 Market Street | 37.77 | -122.43 |
| 2 | 2 | San Francisco | 120 4th Street | 37.78 | -122.40 |
# set up the depot (the place that truck / salesman starts traveling)
i = 74
df["base"] = df["id"].apply(lambda x: 1 if x==i else 0)
base = df[df["base"]==1][["y","x"]].values[0]
print("base =", base)
df.head(3)
base = [ 37.77 -122.43]
| id | City | Street Address | y | x | base | |
|---|---|---|---|---|---|---|
| 0 | 0 | San Francisco | 865 Market St., #324 | 37.78 | -122.41 | 0 |
| 1 | 1 | San Francisco | 2018 Market Street | 37.77 | -122.43 | 0 |
| 2 | 2 | San Francisco | 120 4th Street | 37.78 | -122.40 | 0 |
# plot out the distribution of x and y
plt.scatter(y=df["y"], x=df["x"], color="black")
plt.scatter(y=base[0], x=base[1], color="red")
plt.show()
# Plot stores on realistic map using folium package
# setup
def create_map():
data = df.copy()
color = "base" #color based on this column
lst_colors = ["black","red"]
popup = ["id"] #popup based on this column
lst_icon = ['glyphicon glyphicon-star-empty', 'glyphicon-home']
# base map
map_ = folium.Map(location=base, zoom_start=11)
# add colors
lst_elements = sorted(list(data[color].unique()))
data["color"] = data[color].apply(lambda x:
lst_colors[lst_elements.index(x)])
data["icon"] = data[color].apply(lambda x:
lst_icon[lst_elements.index(x)])
# add popup
data.apply(lambda row:
folium.Marker(
location=[row["y"],row["x"]], tooltip=f"id: {row[popup].values[0]}, base?: {row['base']}"
, fill=True, icon=folium.Icon(color=row["color"], icon=row["icon"])).add_to(map_),
axis=1)
# Add different versions of map
layers = ["cartodbpositron", "openstreetmap","CartoDB.Voyager"]
for tile in layers:
folium.TileLayer(tile).add_to(map_)
folium.LayerControl(position='bottomright',collapsed=False).add_to(map_)
return map_
# show
map_ = create_map()
map_
Shortest Path Problem ΒΆ
Creating Road Map ΒΆ
To get the shortest path from base to any location. We need to establish connections between these nodes using roads as links. OSMnx is a library that can query OpenStreetMap data, extract road details, convert them into links, and then transform this data into graphs.
The variable "G" in the code below represents a road network within a 20 km radius of the base point. This network aids in calculating the shortest travel time from the base to our destination. The graph visually represents the road network, with each circle denoting nodes or road intersections, and the lines indicating distance between nodes.
# create network graph
G = ox.graph_from_point(base, dist=50000,
network_type="drive") #'drive', 'bike', 'walk'
G = ox.add_edge_speeds(G)
G = ox.add_edge_travel_times(G)
# plot
fig, ax = ox.plot_graph(G, bgcolor="black", node_size=5,
node_color="white", figsize=(16,8))
Calculating Shortest Path ΒΆ
After obtaining the distance and time information, we can compute the shortest path from the origin (base) to the destination. In this case, I've set the destination to index 10. I experimented with two different objective functions: minimizing traveling distance and minimizing traveling time. Both optimizations were achieved using Dijkstra's algorithm.
The black and white map below illustrates the results of these shortest paths. The red line represents the optimal path when prioritizing traveling distance, while the blue line indicates the path with the shortest traveling time.
# Example: setting id = 10 as destination
end = df[df["id"]==10][["y","x"]].values[0]
print("locations: from", base, "--> to", end)
# Determine the nearest node on the road map to both our base store and destination store
base_node = ox.distance.nearest_nodes(G, base[1], base[0])
end_node = ox.distance.nearest_nodes(G, end[1], end[0])
print("nodes: from", base_node, "--> to", end_node)
# calculate shortest path using Dijkstraβs algorithm
## prioritizing distance
path_lenght = nx.shortest_path(G, source=base_node, target=end_node,
method='dijkstra', weight='lenght')
print(path_lenght)
## prioritizing time
path_time = nx.shortest_path(G, source=base_node, target=end_node,
method='dijkstra', weight='travel_time')
print(path_time)
# plot on the graph and compare two different objective functions results
fig, ax = ox.plot_graph_routes(G, routes=[path_lenght, path_time],
route_colors=["red","blue"],
route_linewidth=5, node_size=1,
bgcolor='black', node_color="white",
figsize=(16,8))
locations: from [ 37.77 -122.43] --> to [ 37.8 -122.43] nodes: from 65365163 --> to 65336112 [65365163, 65350533, 65332198, 65365166, 65327905, 65365168, 65310809, 65354073, 65325380, 65356398, 65287232, 65352349, 65308302, 65365172, 65365176, 65317971, 65338304, 258761269, 65314192, 65303573, 65319981, 65352464, 65295361, 65307440, 65336914, 65365195, 65337437, 65365236, 65333932, 65299426, 65345288, 65312706, 65312705, 65292764, 65292766, 65319050, 65319054, 65336112] [65365163, 65318023, 65328181, 65328183, 65328185, 65328187, 65327904, 65328191, 65310806, 65306336, 65328196, 65328198, 65325378, 65324344, 342543655, 65328203, 65287225, 65328209, 65308300, 65328216, 65328224, 65328240, 65317967, 258760438, 258760426, 258760428, 2163049313, 65314189, 65303568, 342544910, 65319976, 65328348, 65295355, 65307438, 65328349, 65328351, 65328354, 65328358, 65328360, 65292758, 65292759, 4011962210, 65292764, 65312700, 65312698, 65322710, 65336112]
We can further visualize the optimal route by plotting it back onto the Folium map we created earlier in the previous section.
# plot on the map
ox.plot_route_folium(G, route=path_lenght, route_map=map_,
color="red", weight=1)
ox.plot_route_folium(G, route=path_time, route_map=map_,
color="blue", weight=1)
map_
We can also simulate the driver's route using Plotly.
# Create route dataframe using the path that optimizing driving time (variable: path_time)
lst_start, lst_end = [],[]
start_x, start_y = [],[]
end_x, end_y = [],[]
lst_length, lst_time = [],[]
# Each row represents the route that the driver will pass between the nodes by sequence
for a,b in zip(path_time[:-1], path_time[1:]):
lst_start.append(a)
lst_end.append(b)
lst_length.append(round(G.edges[(a,b,0)]['length']))
lst_time.append(round(G.edges[(a,b,0)]['travel_time']))
start_x.append(G.nodes[a]['x'])
start_y.append(G.nodes[a]['y'])
end_x.append(G.nodes[b]['x'])
end_y.append(G.nodes[b]['y'])
route_simulate_df = pd.DataFrame(list(zip(lst_start, lst_end,
start_x, start_y, end_x, end_y,
lst_length, lst_time)),
columns=["start","end","start_x","start_y",
"end_x","end_y","length","travel_time"]
).reset_index().rename(columns={"index":"id"})
print(route_simulate_df.head())
## create start/end df
df_base = route_simulate_df[route_simulate_df["start"] == base_node]
df_end = route_simulate_df[route_simulate_df["end"] == end_node]
## create basic map
fig = px.scatter_mapbox(data_frame=route_simulate_df, lon="start_x", lat="start_y",
zoom=10, width=800, height=500,
animation_frame="id",
mapbox_style="carto-positron")
## add driver
fig.data[0].marker = {"size":12}
## add start point
fig.add_trace(px.scatter_mapbox(data_frame=df_base,
lon="start_x", lat="start_y").data[0])
fig.data[1].marker = {"size":15, "color":"red"}
## add end point
fig.add_trace(px.scatter_mapbox(data_frame=df_end,
lon="start_x", lat="start_y").data[0])
fig.data[2].marker = {"size":15, "color":"green"}
## add route
fig.add_trace(px.line_mapbox(data_frame=route_simulate_df,
lon="start_x", lat="start_y").data[0])
fig
id start end start_x start_y end_x end_y \ 0 0 65365163 65318023 -122.430084 37.770249 -122.429248 37.770355 1 1 65318023 65328181 -122.429248 37.770355 -122.428445 37.770457 2 2 65328181 65328183 -122.428445 37.770457 -122.428537 37.770927 3 3 65328183 65328185 -122.428537 37.770927 -122.428629 37.771391 4 4 65328185 65328187 -122.428629 37.771391 -122.428722 37.771849 length travel_time 0 74 7 1 72 6 2 53 4 3 52 4 4 52 4
Traveling Salesman Problem ΒΆ
Using the nodes and links on the roadmap, we can delve into solving routing problems that involve passing through multiple stops and encompass more complex business constraints. I suggest beginning with the Traveling Salesman Problem, as it simplifies the business settings with just one loop and a single driver. Our goal is to minimize travel time as the objective function.
Initially, I create a distance matrix that includes the shortest travel times between nodes representing the stores. Subsequently, I employ OR-Tools to solve the problem. OR-Tools is a Python package developed by Google for solving linear programming, vehicle routing, and various related optimization problems.
Creating Distance Matrix ΒΆ
I map the nodes from the roadmap back to the store information table and remove duplicate nodes. Additionally, I fill the null values in the distance matrix with row means and overall means.
# get the node from the roadmap for each store location
df_salesman = df.copy()
df_salesman["node"] = df_salesman[["y","x"]].apply(lambda x:
ox.distance.nearest_nodes(G, x[1], x[0]),
axis=1)
# some stores might share the same nodes
# drop duplicate nodes so that we won't pass by same location multiple times
df_salesman = df_salesman.drop_duplicates("node", keep='first')
df_salesman.info()
<class 'pandas.core.frame.DataFrame'> Index: 38 entries, 0 to 85 Data columns (total 7 columns): # Column Non-Null Count Dtype --- ------ -------------- ----- 0 id 38 non-null int64 1 City 38 non-null object 2 Street Address 38 non-null object 3 y 38 non-null float64 4 x 38 non-null float64 5 base 38 non-null int64 6 node 38 non-null int64 dtypes: float64(2), int64(3), object(2) memory usage: 2.4+ KB
# Create distance matrix for later optimization inputs
## distance length function
def f(a,b):
try:
d = nx.shortest_path_length(G, source=a, target=b,
method='dijkstra',
weight='travel_time')
except:
d = np.nan
return d
## apply the function
distance_matrix = np.asarray([[f(a,b) for b in df_salesman["node"].tolist()]
for a in df_salesman["node"].tolist()])
distance_matrix = pd.DataFrame(distance_matrix,
columns=df_salesman["node"].values,
index=df_salesman["node"].values)
distance_matrix.head()
| 65319536 | 65365163 | 65335256 | 65308300 | 4062847020 | 65307198 | 65298939 | 258911631 | 6311883023 | 1723739241 | ... | 65352339 | 65373364 | 65342609 | 65292743 | 65322686 | 65317190 | 65350725 | 6424341847 | 65342677 | 65288870 | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 65319536 | 0.0 | 189.3 | 104.3 | 191.5 | 716.5 | 722.5 | 518.0 | 558.7 | 185.6 | NaN | ... | 112.0 | 212.6 | 473.0 | 279.7 | 260.6 | 422.7 | 553.5 | 214.6 | 567.9 | 380.3 |
| 65365163 | 211.7 | 0.0 | 265.4 | 90.8 | 555.0 | 578.7 | 380.7 | 438.4 | 281.1 | NaN | ... | 155.3 | 381.4 | 375.0 | 186.2 | 345.2 | 324.7 | 394.2 | 310.2 | 424.1 | 228.4 |
| 65335256 | 144.2 | 258.7 | 0.0 | 299.7 | 677.7 | 821.7 | 479.2 | 664.6 | 85.0 | NaN | ... | 240.1 | 195.8 | 504.6 | 318.4 | 279.2 | 433.0 | 514.7 | 114.4 | 667.1 | 441.9 |
| 65308300 | 199.9 | 106.2 | 261.6 | 0.0 | 659.2 | 609.9 | 483.9 | 394.2 | 335.4 | NaN | ... | 74.4 | 348.0 | 284.2 | 95.4 | 254.4 | 233.9 | 498.4 | 364.5 | 455.3 | 321.0 |
| 4062847020 | 728.0 | 554.8 | 727.4 | 638.1 | 0.0 | 267.6 | 422.3 | 492.2 | 695.9 | NaN | ... | 702.6 | 854.4 | 728.6 | 733.5 | 892.5 | 787.5 | 265.2 | 663.8 | 299.4 | 396.5 |
5 rows Γ 38 columns
# Check null values
heatmap = distance_matrix.copy()
for col in heatmap.columns:
heatmap[col] = heatmap[col].apply(lambda x:
0.3 if pd.isnull(x) else #nan -> purple
(0.7 if np.isinf(x) else #inf -> orange
(0 if x!=0 else 1) )) # 0 -> white
fig, ax = plt.subplots(figsize=(10,5))
sns.heatmap(heatmap, vmin=0, vmax=1, cbar=False, ax=ax)
plt.show()
# Fill null values with row mean and total mean
# fillna with row average
distance_matrix = distance_matrix.T.fillna(distance_matrix.mean(axis=0)).T
# fillna with overall average
distance_matrix = distance_matrix.fillna(distance_matrix.mean().mean())
# Check null values after filling na
heatmap = distance_matrix.copy()
for col in heatmap.columns:
heatmap[col] = heatmap[col].apply(lambda x:
0.3 if pd.isnull(x) else #nan -> purple
(0.7 if np.isinf(x) else #inf -> orange
(0 if x!=0 else 1) )) # 0 -> white
fig, ax = plt.subplots(figsize=(10,5))
sns.heatmap(heatmap, vmin=0, vmax=1, cbar=False, ax=ax)
plt.show()
Calculting TSP Optimal Path ΒΆ
First, establish the business parameters, noting that we have only one driver available. Subsequently, initialize a routing index manager with details about the nodes, drivers, and base node. This manager is then used to create a routing model, which serves as the basis for applying optimization algorithms to determine optimal routes for vehicles or comparable routing scenarios.
# Business settings
drivers = 1
lst_nodes = df_salesman["node"].tolist()
print("start:", base_node, "| tot locations to visit:",
len(lst_nodes)-1, "| drivers:", drivers)
start: 65365163 | tot locations to visit: 37 | drivers: 1
manager = pywrapcp.RoutingIndexManager(len(lst_nodes),
drivers,
lst_nodes.index(base_node))
model = pywrapcp.RoutingModel(manager)
Next, set up a callback function (get_distance) to dynamically compute distances between nodes based on a distance matrix. It then registers this callback with the routing model and configures the model to use this callback to evaluate arc costs for all vehicles during the optimization process.
# Get the traveling time from node to node
def get_distance(from_index, to_index):
return distance_matrix.iloc[from_index,to_index]
distance = model.RegisterTransitCallback(get_distance)
model.SetArcCostEvaluatorOfAllVehicles(distance)
Then, set up routing search parameters with a specific first solution strategy, which influences how OR-Tools explores the search space and finds an initial solution to the routing problem. Here, we use "PATH_CHEAPEST_ARC", a strategy that prioritizes the selection of the most economical edge at each stage of route construction. This strategy is similar the the nearest neighbor search algorothm which primarily emphasizes information about nearby roads or edges.
# Searching strategy settings1
parameters = pywrapcp.DefaultRoutingSearchParameters()
parameters.first_solution_strategy = (
routing_enums_pb2.FirstSolutionStrategy.PATH_CHEAPEST_ARC # select the cheapest edge at each search
)
Finally, we can start running the solver. The model.SolveWithParameters(parameters) initiates the optimization process to find an initial solution based on the strategy. Then, it iterates through the solution to construct the route for the driver, starting from the first node and moving along the path until reaching the end.
The loop retrieves the node indices along the route, calculates the total distance traveled using the get_distance function, and updates the route distance accordingly. Finally, the code prints out the route indices, the total time taken in hours, and the number of nodes visited in the route.
# PATH_CHEAPEST_ARC solution
solution = model.SolveWithParameters(parameters)
index = model.Start(0)
print('Route for driver:')
route_idx, route_distance = [], 0
while not model.IsEnd(index):
route_idx.append( manager.IndexToNode(index) )
previous_index = index
index = solution.Value( model.NextVar(index) )
### update distance
try:
route_distance += get_distance(previous_index, index)
except:
route_distance += model.GetArcCostForVehicle(
from_index=previous_index,
to_index=index,
vehicle=0)
print(route_idx)
print(f'Total Time: {round(route_distance/3600, 2)} hours')
print(f'Nodes visited: {len(route_idx)}')
Route for driver: [1, 37, 36, 35, 34, 33, 32, 31, 30, 29, 28, 27, 26, 25, 24, 23, 22, 21, 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 0] Total Time: 4.26 hours Nodes visited: 38
Likewise, we can overlay the optimal route onto the map generated in the preceding section. The blue line on the map signifies the driver's path to visit all stores. Although we have the option to use Plotly for simulating the trip, I prefer to omit that step for the sake of simplicity.
# Map the salesman's route back to the map
## Get path between nodes
def get_path_between_nodes(lst_route):
lst_paths = []
for i in range(len(lst_route)):
try:
a, b = lst_nodes[i], lst_nodes[i+1]
except:
break
try:
path = nx.shortest_path(G, source=a, target=b,
method='dijkstra',
weight='travel_time')
if len(path) > 1:
lst_paths.append(path)
except:
continue
return lst_paths
lst_paths = get_path_between_nodes(route_idx)
map_ = create_map()
# Add paths on the map
for path in lst_paths:
ox.plot_route_folium(G, route=path, route_map=map_,
color="blue", weight=1)
map_
Vehicle Routing Problem ΒΆ
Lastly, I would like to deal with Vehicle Routing Problem which involve more constraints and more complexity. Here, I would like to introduce limitations for the drivers in both carrying capacity and coverable distance. In this example, we can assume that we have 5 drivers, each capable of driving for 8 hours per day and restricted to visiting a maximum of 15 stores daily. Similarly, we also want to minimize total traveling time.
Calculating VRP Optimal Path ΒΆ
General SettingsΒΆ
First, specify the number of drivers available, the capacity of each driver, the total demand (total number of nodes to be visited), and the maximum allowable travel time.
## Business parameters and constraints
drivers2 = 3
driver_capacities = [13,13,13]
demands = [0] + [1]*(len(lst_nodes)-1) # different store could have different demand but here we assume all = 1
max_time = 2 * 60 * 60 # drive 3 hours a day
Similarly, we create a manager object to track progress, a model to establish the simulation environment, and a distance function for computing costs.
## new manager and model
manager2 = pywrapcp.RoutingIndexManager(len(lst_nodes),
drivers2,
lst_nodes.index(base_node))
model2 = pywrapcp.RoutingModel(manager2)
## add traveling time (cost)
def get_distance(from_index, to_index):
return distance_matrix.iloc[from_index,to_index]
distance2 = model2.RegisterTransitCallback(get_distance)
model2.SetArcCostEvaluatorOfAllVehicles(distance2)
Setting ConstraintsΒΆ
Different from the traveling salesman problem, we need to add more contraints to the model.
## add capacity constraint
def get_demand(from_index):
return demands[from_index]
demand = model2.RegisterUnaryTransitCallback(get_demand) # provide the capacity requirement for each node (store)
model2.AddDimensionWithVehicleCapacity(demand, slack_max=0, # Sets the maximum extra capacity for each vehicle.
vehicle_capacities=driver_capacities,
fix_start_cumul_to_zero=True, # Cumulative demand starts at zero for each vehicle's route
name='Capacity')
# add traveling time constraint
name = 'Distance'
model2.AddDimension(distance2, slack_max=0, capacity=max_time, # restrict the maximum distance traveled by each vehicle
fix_start_cumul_to_zero=True
, name=name)
distance_dimension = model2.GetDimensionOrDie(name)
# sets a cost coefficient for the distance dimension
# influence how the model prioritizes minimizing travel distances while optimizing routes
distance_dimension.SetGlobalSpanCostCoefficient(100)
Here I use the same policy as Traveling Salesman Problem, choosing the cheapest edge near the node.
# set strategy to minimize traveling time
parameters2 = pywrapcp.DefaultRoutingSearchParameters()
parameters2.first_solution_strategy = (
routing_enums_pb2.FirstSolutionStrategy.PATH_CHEAPEST_ARC
)
solution2 = model2.SolveWithParameters(parameters2)
After setting the environment and constrains, we can start solving the problem. It iterate through each driver's route, starting from their designated starting point and ending when the driver completes their route. Within the loop, the code tracks the route indices, distance traveled, and load carried by each driver. The get_distance function or the model's method model2.GetArcCostForVehicle is used to calculate the distance traveled between nodes, and the demands dictionary is accessed to determine the load carried by each node.
Finally, we can get the route indices, traveling time, and load for each driver, as well as the total traveling time and load for all drivers combined.
dic_routes_idx, total_distance, total_load = {}, 0, 0
for driver in range(drivers2):
print(f'Route for driver {driver}:')
index = model2.Start(driver)
route_idx, route_distance, route_load = [], 0, 0
while not model2.IsEnd(index):
node_index = manager2.IndexToNode(index)
route_idx.append( manager2.IndexToNode(index) )
previous_index = index
index = solution2.Value( model2.NextVar(index) )
### update distance
try:
route_distance += get_distance(previous_index, index)
except:
route_distance += model2.GetArcCostForVehicle(
from_index=previous_index,
to_index=index,
vehicle=driver)
### update load
route_load += demands[node_index]
route_idx.append( manager2.IndexToNode(index) )
print(route_idx)
dic_routes_idx[driver] = route_idx
print(f'traveling time: {round(route_distance/3600,2)} hours')
print(f'load: {round(route_load,2)}', "\n")
total_distance += route_distance
total_load += route_load
print(f'traveling time: {round(total_distance/3600,2)} hours')
print(f'Total load: {total_load}')
Route for driver 0: [1, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1] traveling time: 1.19 hours load: 11 Route for driver 1: [1, 24, 23, 22, 21, 20, 19, 18, 17, 16, 15, 14, 13, 12, 1] traveling time: 1.36 hours load: 14 Route for driver 2: [1, 37, 36, 35, 34, 33, 32, 31, 30, 29, 28, 27, 26, 25, 0, 1] traveling time: 1.42 hours load: 14 traveling time: 3.97 hours Total load: 39
We can also visualize the path in the map.
# Convert from idx to nodes
dic_route = {}
for k,v in dic_routes_idx.items():
print(f"Route for driver {k} (nodes):")
dic_route[k] = [lst_nodes[i] for i in v]
print(dic_route[k], "\n")
# Get path between nodes
dic_paths = {k:get_path_between_nodes(v) for k,v in dic_route.items()}
map_ = create_map()
# Add paths on the map
lst_colors = ["green","red","blue"]
for k,v in dic_paths.items():
for path in v:
ox.plot_route_folium(G, route=path, route_map=map_,
color=lst_colors[k], weight=round(5/(k+1),0), opacity=0.5)
color=lst_colors[2], weight=0.5, opacity=0.5)
map_
Route for driver 0 (nodes): [65365163, 6311848840, 65336112, 1723739241, 6311883023, 258911631, 65298939, 65307198, 4062847020, 65308300, 65335256, 65365163] Route for driver 1 (nodes): [65365163, 434685829, 65319946, 65283764, 65303726, 11834834867, 65287649, 65340480, 65336921, 65312219, 65299283, 65327169, 65290299, 65340931, 65365163] Route for driver 2 (nodes): [65365163, 65288870, 65342677, 6424341847, 65350725, 65317190, 65322686, 65292743, 65342609, 65373364, 65352339, 65308466, 65298596, 65321070, 65319536, 65365163]